1. Introduction.

In this paper we are concerned with the existence of optimal
controls for problems of the following kind. Let $X_t$ denote the
process which we wish to control, $Y_t$ the observation process and
$U_t$ the control process, $0 \le t \le T$, with $T$ fixed. The state
and observation processes are governed by stochastic differential
equations

$$
\begin{aligned}
\text{(a)}\quad dX_t &= b(t,X_t,Y_t,U_t)\,dt + \sigma(t,X_t,Y_t)\,dW_t,\\
\text{(b)}\quad dY_t &= h(t,X_t)\,dt + d\tilde W_t.
\end{aligned}
\qquad (1.1)
$$

$X_t$ has values in $N$-dimensional $\mathbb{R}^N$, $Y_t$ values in
$\mathbb{R}^1$, and $U_t$ values in $\mathcal{U} \subset \mathbb{R}^{\nu}$.
[Only some notational complications are involved if vector-valued
observations $Y_t$ are considered.] $X_0$ has given distribution, with
density $p_0(x)$, and $Y_0 = 0$. In (1.1), $W$ and $\tilde W$ are
independent Wiener processes.
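To make the model (1.1) concrete, the following is a minimal Euler-Maruyama sketch for a scalar example ($N = 1$). Everything specific in it is an assumption made for illustration and does not come from the paper: the coefficients `b`, `sigma`, `h`, the standard normal initial density, the control set $[-1, 1]$, and the feedback rule `control`, which depends only on the current observation.

```python
import math
import random


def b(t, x, y, u):
    # Hypothetical drift, affine in the control u; both parts are bounded in x.
    return -math.tanh(x) + u


def sigma(t, x, y):
    # Hypothetical constant, invertible diffusion coefficient.
    return 1.0


def h(t, x):
    # Hypothetical bounded observation function.
    return math.tanh(x)


def control(t, y):
    # Illustrative control using only the current observation,
    # clipped to the compact convex set [-1, 1].
    return max(-1.0, min(1.0, -y))


def simulate(T=1.0, n_steps=200, seed=0):
    """Euler-Maruyama discretization of (1.1)(a)-(b) on [0, T]."""
    rng = random.Random(seed)
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    x = rng.gauss(0.0, 1.0)  # X_0 drawn from an assumed standard normal p_0
    y = 0.0                  # Y_0 = 0
    xs, ys, us = [x], [y], []
    for k in range(n_steps):
        t = k * dt
        u = control(t, y)
        dw = rng.gauss(0.0, sqrt_dt)        # increment of W
        dw_tilde = rng.gauss(0.0, sqrt_dt)  # increment of the independent W-tilde
        # Both updates use the state at the start of the step.
        x, y = (x + b(t, x, y, u) * dt + sigma(t, x, y) * dw,
                y + h(t, x) * dt + dw_tilde)
        xs.append(x)
        ys.append(y)
        us.append(u)
    return xs, ys, us
```

Calling `simulate(seed=1)` returns one sampled trajectory of the state, the observation, and the control values applied on each step.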

The problem is to minimize a criterion of the form

$$
J = E\left\{ \int_0^T F(t,X_t,U_t)\,dt + G(X_T) \right\}. \qquad (1.2)
$$

It is customary to require that $U_t$ be measurable with respect to
the $\sigma$-algebra generated by observations $Y_s$, $0 \le s \le t$.
We call this the strict sense version of the problem. For several
years the question of proving a general theorem about existence of
optimal controls in the strict sense has been open. We do not obtain
such a result here. However, we obtain an existence theorem in which a
somewhat wider class of control processes is admitted. Roughly
speaking, this wider class of controls is obtained as follows. Let

$$
Z_t = \exp\left[ \int_0^t h(s,X_s)\,dY_s - \frac{1}{2}\int_0^t h^2(s,X_s)\,ds \right]. \qquad (1.3)
$$


Then $W$, $Y$ are independent Wiener processes under a new probability
measure $\mathring{P}$ related to the original probability measure $P$
by $dP = Z_T\,d\mathring{P}$. In the wide sense formulation we wish to
require merely that $U_s$ for $s \le t$ be independent of future
increments $Y_r - Y_\rho$ for $t \le \rho \le r$ with respect to
$\mathring{P}$. In §2 we give a precise formulation of this idea, in
which we define the control as the joint distribution measure of the
processes $Y,U$.
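The change of measure can be checked numerically. The sketch below assumes the usual Girsanov convention that the new measure has density $Z_T^{-1}$ with respect to $P$, so that the average of $Z_T^{-1}$ over paths simulated under $P$ should be close to 1. It uses an uncontrolled ($U \equiv 0$) scalar model; the drift, the observation function `h`, and the standard normal initial law are hypothetical choices, not taken from the paper.

```python
import math
import random


def h(t, x):
    # Hypothetical bounded observation function.
    return math.tanh(x)


def sample_inverse_z(rng, T=1.0, n_steps=100):
    """Simulate one path under P and return 1 / Z_T, with Z_T as in (1.3)."""
    dt = T / n_steps
    sqrt_dt = math.sqrt(dt)
    x = rng.gauss(0.0, 1.0)  # assumed initial law of X_0
    log_z = 0.0
    for k in range(n_steps):
        t = k * dt
        hx = h(t, x)
        dy = hx * dt + rng.gauss(0.0, sqrt_dt)         # observation increment
        log_z += hx * dy - 0.5 * hx * hx * dt          # discretized exponent of Z_t
        x += -math.tanh(x) * dt + rng.gauss(0.0, sqrt_dt)  # hypothetical state step
    return math.exp(-log_z)


def mean_inverse_z(n_paths=4000, seed=1):
    """Monte Carlo estimate of E_P[1 / Z_T]."""
    rng = random.Random(seed)
    return sum(sample_inverse_z(rng) for _ in range(n_paths)) / n_paths
```

In this discretization each factor of $Z_T^{-1}$ has conditional mean exactly 1, so `mean_inverse_z()` deviates from 1 only through Monte Carlo error.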

Our method depends on introducing another stochastic control problem,
which we call a "separated" problem. This separated problem is
equivalent to the one formulated in §2. In the separated problem the
"state" $p(t,\cdot)$ at time $t$ is a function obeying a linear,
parabolic partial differential equation (3.4). The coefficients of
(3.4) depend on the observations $Y_t$ and controls $U_t$,
$0 \le t \le T$. The solution $p(t,x)$ is related in a simple way to
the unnormalized conditional density $q(t,x)$ of $X_t$, given
observations $Y_s$ and controls $U_s$ for $s \le t$. See (3.6). The
proof of this fact makes use of probabilistic solutions to a
"backward" partial differential equation adjoint to the "forward"
equation (3.4), an idea already exploited in [3] for the nonlinear
filter problem. However, unlike [3] we work with (3.4) instead of the
Zakai equation (3.7) for $q$. In this way, Ito stochastic integrals
and results about stochastic PDE's are avoided in the analysis. For
the nonlinear filter problem, equation (3.4) was derived by Davis [1].

2. Formulation of the problem.

We make the following assumptions about the functions $b$, $\sigma$,
$h$ in (1.1).

(A1) $\sigma$ and its partial derivatives $\partial\sigma/\partial x_j$,
$j = 1,\ldots,N$, are bounded, continuous functions of $(t,x,y)$.
Moreover, $\sigma$ has an inverse $\sigma^{-1}$, which is a bounded
function of $(t,x,y)$.

(A2) $b(t,x,y,u) = b^0(t,x,y) + u\,b^1(t,x,y)$, where $b^0$ and $b^1$
are bounded, continuous functions of $(t,x,y)$.

(A3) $h$, $\partial h/\partial t$, $\partial h/\partial x_i$,
$\partial^2 h/\partial x_i \partial x_j$, $i,j = 1,\ldots,N$, are
bounded, continuous functions.

We also assume:

(A4) $\mathcal{U}$ is a convex, compact subset of $\mathbb{R}^{\nu}$.

(A5) The density $p_0(x)$ of $X_0$ is in $L^2(\mathbb{R}^N)$, and

$$
\int_{\mathbb{R}^N} |x|^l\, p_0(x)\,dx < \infty \quad \text{for some } l > 1.
$$


We formulate the problem on the "canonical" sample space

$$
\Omega = C([0,T];\mathbb{R}^N) \times C([0,T];\mathbb{R}^1) \times L^2([0,T];\mathcal{U}),
$$

whose elements $\omega$ satisfy

$$
\omega(t) = (X_t(\omega), Y_t(\omega), U_t(\omega)), \quad 0 \le t \le T.
$$

We give $C([0,T];\mathbb{R}^d)$ for $d = 1, N$ the usual norm
topology; and we give $L^2([0,T];\mathcal{U})$ the weak topology,
which is metrizable since $\mathcal{U}$ is compact. We consider the
following increasing families of $\sigma$-algebras:
$$
\mathcal{F}_t = \sigma\{X_s,\ s \le t\}, \qquad
\mathcal{G}_t = \sigma\{Y_s, U_s,\ s \le t\}, \qquad
\mathcal{F}_t \vee \mathcal{G}_t.
$$


Definition. An admissible control is a probability measure $\pi$ on
$(\Omega, \mathcal{G}_T)$ such that $Y$ is a $(\pi, \mathcal{G}_t)$
Wiener process.

Let $\mathcal{A}$ denote the set of all admissible controls $\pi$.
Each $\pi \in \mathcal{A}$ determines the joint distribution measure
of $(X,Y,U)$ as follows. Given $Y \in C([0,T];\mathbb{R}^1)$ and
$U \in L^2([0,T];\mathcal{U})$ let $P^{Y,U}$ be the unique probability
measure on $(\Omega, \mathcal{F}_T)$ such that $P^{Y,U}$ is the
solution to the martingale problem [6] associated with (1.1)(a), and

$$
P^{Y,U}(X_0 \in B) = \int_B p_0(x)\,dx
$$

for all Borel $B \subset \mathbb{R}^N$. Let

$$
\mathring{P}_\pi(dX,dY,dU) = P^{Y,U}(dX)\,\pi(dY,dU),
$$

and define $P_\pi$ by

$$
dP_\pi = Z_T\,d\mathring{P}_\pi, \qquad (2.1)
$$

with $Z_T$ as in (1.3). It can be shown that there exist independent
$P_\pi$ Wiener processes $W$ and $\tilde W$ such that (1.1) holds
$P_\pi$-almost surely.

Let us write $\mathring{E}_\pi$, $E_\pi$ for expectations with respect
to $\mathring{P}_\pi$, $P_\pi$ respectively. Then (1.2) becomes

$$
J(\pi) = E_\pi\left\{ \int_0^T F(t,X_t,U_t)\,dt + G(X_T) \right\}. \qquad (2.2)
$$

We make the following assumptions about $F$ and $G$.

(A6) $F$, $G$ are measurable. For fixed $(t,x)$, $F(t,x,\cdot)$ is
continuous and convex on $\mathcal{U}$. For some $C$, $m > 0$,

$$
0 \le F(t,x,u) \le C(1+|x|)^m, \qquad 0 \le G(x) \le C(1+|x|)^m.
$$

In (A5) we take $l > m$, which implies that $J(\pi) < \infty$.

Our result about existence of an optimal control is the following.

Theorem. There exists $\pi^* \in \mathcal{A}$ such that
$J(\pi^*) \le J(\pi)$ for all $\pi \in \mathcal{A}$.

In §§3, 4 we indicate the method of proof. A detailed proof will be
given elsewhere.

The projection of any $\pi \in \mathcal{A}$ under $(Y,U) \to Y$ is
Wiener measure $\mu$ on $C([0,T];\mathbb{R}^1)$. Let $\pi^Y(dU)$ be a
regular conditional distribution for $U$ given $Y$. We call $\pi$
admissible in the strict sense if $\pi \in \mathcal{A}$ and $\pi^Y$ is
a Dirac measure, concentrated at a point
$U(Y) \in L^2([0,T];\mathcal{U})$, $\mu$-almost surely. It can be
shown that $J(\pi^*)$ equals the infimum of $J(\pi)$ among strict
sense admissible controls; but it has not been shown that a strict
sense optimal control exists. By admitting wider sense controls
$\pi \in \mathcal{A}$, we in effect allow the control $U_t$ to depend
on auxiliary randomizations in addition to the observations $Y_s$ for
$s \le t$.

3. The filtering equations.

Given trajectories $Y$ and $U$ for the observation and control
processes, consider the elliptic partial differential operators
associated with (1.1)(a):

$$
L_t = \frac{1}{2} \sum_{i,j=1}^{N} a_{ij} \frac{\partial^2}{\partial x_i \partial x_j} + b \cdot \nabla, \qquad (3.1)
$$

where $a = \sigma\sigma^{T}$ and $\nabla$ is the gradient in $x$. Let

$$
\tilde L_t = L_t - Y_t \sum_{i=1}^{N} \left[ \sum_{j=1}^{N} a_{ij} \frac{\partial h}{\partial x_j} \right] \frac{\partial}{\partial x_i}, \qquad (3.2)
$$

$$
e(t,x) = \frac{1}{2} Y_t^2 (a\nabla h, \nabla h) - Y_t \left( \frac{\partial h}{\partial t} + L_t h \right) - \frac{1}{2} h^2. \qquad (3.3)
$$

Let $p(t,x)$ be the unique solution in
$L^2([0,T];H^1) \cap C([0,T];L^2(\mathbb{R}^N))$ to the partial
differential equation

$$
\frac{\partial p}{\partial t} = (\tilde L_t)^* p + e(t,x)\,p \qquad (3.4)
$$


with $p(0,x) = p_0(x)$. The following key formula can be proved. Given
$\pi \in \mathcal{A}$, then for every bounded continuous $f$

$$
\int_{\mathbb{R}^N} p(t,x) \exp[Y_t h(t,x)]\, f(x)\,dx = \mathring{E}_\pi\left[ f(X_t) Z_t \mid \mathcal{G}_t \right], \qquad (3.5)
$$

$\pi$-almost surely. The proof involves the backward partial
differential equation adjoint to (3.4), to whose solutions an
appropriate version of the Feynman-Kac formula is applied.
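The factor $\exp[Y_t h(t,x)]$ and the form of $e$ in (3.3) can be motivated by a formal integration by parts. The lines below are only a heuristic sketch, not the detailed proof referred to above. They assume that $W$ and $Y$ are independent under the reference measure, so that $h(s,X_s)$ and $Y_s$ have no cross variation.

```latex
\begin{aligned}
\int_0^t h(s,X_s)\,dY_s
  &= Y_t\,h(t,X_t) - \int_0^t Y_s\,dh(s,X_s), \\
dh(s,X_s)
  &= \Big(\frac{\partial h}{\partial s} + L_s h\Big)(s,X_s)\,ds
     + \nabla h(s,X_s)\cdot\sigma\,dW_s, \\
Z_t\,e^{-Y_t h(t,X_t)}
  &= \exp\Big[-\int_0^t Y_s\,\nabla h\cdot\sigma\,dW_s
       - \frac{1}{2}\int_0^t Y_s^2\,(a\nabla h,\nabla h)\,ds\Big]
     \,\exp\Big[\int_0^t e(s,X_s)\,ds\Big].
\end{aligned}
```

The first exponential is a Girsanov density that changes the drift of $X$ from $b$ to $b - Y_s\,a\nabla h$, which is the drift of the operator in (3.2). The second is a Feynman-Kac weight with potential $e$, which accounts for the zero-order term in (3.4).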


Let

$$
q(t,x) = p(t,x) \exp[Y_t h(t,x)]. \qquad (3.6)
$$

Equation (3.5) implies that $q(t,x)$ is the unnormalized conditional
density of $X_t$ given $\mathcal{G}_t$ (in other words, given past
observations and controls $Y_s$, $U_s$ for $s \le t$). It can be shown
that $q$ satisfies the Zakai equation

$$
dq = (L_t)^* q\,dt + h\,q\,dY_t \qquad (3.7)
$$

with $q(0,x) = p_0(x)$. The conditional density of $X_t$ given
$\mathcal{G}_t$ is

$$
\hat q(t,x) = \frac{q(t,x)}{\int_{\mathbb{R}^N} q(t,\xi)\,d\xi}. \qquad (3.8)
$$
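The passage from $p$ to the conditional density is a pointwise multiplication followed by a normalization. The sketch below illustrates this on a one-dimensional grid. The grid, the observation function `h`, the value of $Y_t$, and the Gaussian-shaped stand-in for a solution $p(t,\cdot)$ are all assumptions made for illustration; no partial differential equation is solved here.

```python
import math


def h(t, x):
    # Hypothetical bounded observation function.
    return math.tanh(x)


def unnormalized_density(p_values, grid, t, y_t):
    """Apply q(t,x) = p(t,x) * exp(Y_t * h(t,x)) at each grid point."""
    return [p * math.exp(y_t * h(t, x)) for p, x in zip(p_values, grid)]


def trapezoid(values, grid):
    """Trapezoidal approximation of the integral of values over the grid."""
    total = 0.0
    for i in range(len(grid) - 1):
        total += 0.5 * (values[i] + values[i + 1]) * (grid[i + 1] - grid[i])
    return total


def normalize(q_values, grid):
    """Divide q by its total mass to obtain a probability density on the grid."""
    mass = trapezoid(q_values, grid)
    return [q / mass for q in q_values]


# Placeholder data: a symmetric bump standing in for a solution p(t, .).
grid = [-5.0 + 0.05 * i for i in range(201)]
p_values = [math.exp(-0.5 * x * x) for x in grid]

q_values = unnormalized_density(p_values, grid, t=0.5, y_t=0.8)
q_hat = normalize(q_values, grid)
```

With a positive observation value and an increasing `h`, the exponential factor shifts mass toward larger $x$, so the mean of `q_hat` is positive even though the placeholder `p_values` is symmetric.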


4. A separated control problem.

A well known idea is to introduce the conditional distribution of a
partially observed state $X_t$ as the "state" in a new "separated"
control problem. This idea is the key to the classical separation
principle for linear-quadratic problems [2, Chap. VI.11]. Similar
ideas occur in the control of partially observed Markov chains [5] and
of jump processes [4].

In the present context, we may take $p(t,\cdot)$ as the state at time
$t$ in a separated problem, since the conditional distributions of
$X_t$ are determined from $p(t,\cdot)$ through (3.6) and (3.8). The
dynamics of the state process in the separated control problem are
(3.4). Both $e$ and the coefficients of $\tilde L_t$ depend on
trajectories $Y$ and $U$ for the observation and control processes.
Let

$$
\tilde\Omega = C([0,T];\mathbb{R}^1) \times L^2([0,T];\mathcal{U}).
$$

For each $(Y,U) \in \tilde\Omega$, $p = p^{Y,U}$ is the unique
solution to (3.4) with the given initial data $p(0,x) = p_0(x)$. In
(3.6) we also write $q = q^{Y,U}$ for the unnormalized conditional
density. From (2.1) and elementary properties of conditional
expectations with respect to the $\sigma$-algebras $\mathcal{G}_t$,
(2.2) can be rewritten as


$$
J(\pi) = \int_{\tilde\Omega} \left\{ \int_0^T \!\! \int_{\mathbb{R}^N} F(t,x,U_t)\, q^{Y,U}(t,x)\,dx\,dt + \int_{\mathbb{R}^N} G(x)\, q^{Y,U}(T,x)\,dx \right\} d\pi. \qquad (4.1)
$$

The separated problem is to show that there exists
$\pi^* \in \mathcal{A}$ minimizing (4.1). Once this is shown, the
Theorem in §2 follows immediately.

The proof of existence of $\pi^*$ proceeds as follows. Let $\pi_n$ be
any minimizing sequence in $\mathcal{A}$. The sequence of probability
measures $\pi_n$ is tight, and hence a subsequence converges weakly to
a limit $\pi^*$. Moreover, $\pi^* \in \mathcal{A}$. Finally, it is
shown that

$$
J(\pi^*) \le \liminf_{n \to \infty} J(\pi_n);
$$

the proof depends on linearity of $b$ and convexity of $F$ in the
control variable $u$ (see assumptions (A2), (A6) in §2) as well as
results from PDE about continuous dependence on $Y,U$ of solutions
$p^{Y,U}$ to (3.4).

[5] A. Segall, Optimal control of noisy finite state Markov
processes, IEEE Trans. on Auto. Control, AC-22 (1977), 179-186.

[6] D.W. Stroock and S.R.S. Varadhan, Multidimensional Diffusion
Processes, Springer-Verlag, 1979.